Freeing vision from frames

Authors

Tobi Delbruck

Patrick Lichtsteiner

Abstract

Volume 3, Issue 1, May 2006

The notion of a ‘frame’ of video data has become so embedded in computer vision that it is taken for granted. This is natural, given that the only available input devices have always been frame-based, from drum scanners and vidicon tubes to CCDs (charge-coupled devices) and CMOS (complementary metal-oxide semiconductor) imagers. Frame-based imagers also have undeniable advantages: they use small pixels, are easy to understand, and are compatible with standard output devices. But are frames the way to go for vision problems, or are they just a holdover from video?

Frames carry a heavy penalty: frame-based vision is centered on a stroboscopic series of snapshots taken at a constant rate. The pixels are sampled redundantly, over and over, even if they have nothing novel to say. Bandwidth and dynamic range are limited because every pixel shares the same sampling rate and integration time. When a human composes a static picture, these may not be terrible disadvantages, but for machine vision in unsupervised environments, limited dynamic range and sampling rate can be extremely important.

Over the past decade, a handful of developers have created novel vision sensor devices that adopt the neuromorphic architecture of redundancy-reduced address-event output. (We don’t have room here to discuss imaging devices that don’t reduce redundancy.) Some of these devices abandon frames altogether. Starting from Mahowald’s address-event representation (AER) silicon retina [1], these new devices offer the promise of more effective ways of tackling real-world vision problems. Mahowald’s AER retina was a proof-of-concept device that was unusable for any real-world task; in fact, it had to be shown something like a flashing LED to produce any sensible response.
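To make the contrast between redundant frame sampling and redundancy-reduced address-event output concrete, here is a minimal software sketch. It is an illustration only, not the circuit of any device discussed here: real AER sensors generate events asynchronously in each pixel rather than by differencing frames. The function name `frames_to_events`, the `threshold` parameter, and the `(t, x, y, polarity)` event layout are all assumptions made for this example.

```python
from typing import List, Tuple

Frame = List[List[int]]             # rows of pixel intensities
Event = Tuple[int, int, int, int]   # (t, x, y, polarity), a hypothetical layout


def frames_to_events(frames: List[Frame], threshold: int = 10) -> List[Event]:
    """Emit an event only when a pixel has changed by at least `threshold`
    since the value it last reported. Unchanged pixels stay silent."""
    if not frames:
        return []
    # Each pixel remembers the value it last reported.
    reference = [row[:] for row in frames[0]]
    events: List[Event] = []
    for t, frame in enumerate(frames[1:], start=1):
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                delta = value - reference[y][x]
                if abs(delta) >= threshold:
                    events.append((t, x, y, 1 if delta > 0 else -1))
                    reference[y][x] = value
    return events


# Four 2x2 frames in which only two pixels ever change.
frames = [
    [[50, 50], [50, 50]],
    [[50, 50], [50, 80]],
    [[50, 50], [50, 80]],
    [[20, 50], [50, 80]],
]
events = frames_to_events(frames)
frame_samples = sum(len(row) for frame in frames for row in frame)
print(f"{frame_samples} frame samples vs {len(events)} events: {events}")
```

In this toy sequence the frame-based representation carries 16 pixel samples, while the event list carries two entries, one for each pixel that actually changed.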
The University of Pennsylvania’s silicon retina [2] marked a major advance by incorporating both sustained and transient types of cells with adaptive spatial and temporal filtering, meaning that the space and time constants vary according to the illumination level and spatio-temporal contrast. This functionality is achieved by the use of tightly coupled log-domain current-mode circuits. Of all devices built so far, this one comes closest to capturing key adaptive features of biological retinas. However, the price for this functionality is mismatch: the DC firing rates vary by a factor of 1,000, and one-half of the pixels do not spike at all for moderate contrast. In addition, the use of a passive phototransistor current-gain ...
